A method and system for synchronizing execution of threads within a process. Before execution of a thread commences, a determination is made as to whether all of the required resources for execution of the thread are available in a cache local to the processing element. If the resources are not available, then the resources are fetched from main storage and stored in one or more local caches before execution begins. If the resources are available, then execution of the thread may begin. During execution of the thread and, in particular, an instruction within the thread, the instruction may require data in order to successfully complete its execution. When this occurs, a determination is made as to whether the necessary data is available. If the data is available, the result of the instruction execution is stored and execution of the thread continues. However, if the data is unavailable, then the thread is deferred until the data becomes available and a new thread is processed. When deferring a thread, the thread is placed in the memory location which is to receive the required data. Once the data is available, the thread is removed from the data location and placed on a queue for execution and the data is stored in the location.

TECHNICAL FIELD

This invention relates in general to data synchronization within a parallel data processing system, and more particularly to the synchronization of threads within a process being executed by a processing element.

BACKGROUND ART

In parallel data processing systems, programs to be executed may be divided into a number of processes which may be executed in parallel by a plurality of processing elements. Each process includes one or more threads and each thread includes a group of sequentially executable instructions. The simultaneous execution of a number of threads requires synchronization or time-coordination of the activities associated with each thread. Without synchronization a processor may sit idle for a great deal of time waiting for data it requires, thereby degrading system performance and utilization.

A thread located in one process is capable of communicating with threads in another process or in the same process and therefore, various levels of synchronization are required in order to have an efficiently executing system and good system performance.

In order to synchronize the communication of threads located in different processes, a synchronization mechanism, such as I-structures, may be used. I-structures are used in main storage and are described in I-structures: Data Structures for Parallel Computing by Arvind, R.S. Nikhil and K.K. Pingali, Massachusetts Institute of Technology Laboratory for Computer Science, February 1987.

Synchronization of threads communicating between different processes does not negate the need for a synchronization mechanism used to synchronize threads within the same process. Therefore, a need still exists for an efficient manner to synchronize threads within a process thereby providing greater system utilization and performance. A need also exists for a synchronization mechanism of threads within a process wherein the synchronization mechanism is local to the processing element in which the threads are executed. A further need exists for a synchronization mechanism which does not place a constraint on the number of processes and threads which may be executed by the processing element due to the size of local memory.



DISCLOSURE OF INVENTION

The shortcomings of the prior art are overcome and additional advantages are provided in accordance with the principles of the present invention through the provision of a method and system for synchronizing threads within a process.

In accordance with the principles of the present invention, a method for synchronizing execution by a processing element of threads within a process is provided. The method includes fetching during execution of a thread within a process a datum field from a local frame cache and an associated state indicator from a state bit cache. The state indicator has a first state value which is used to determine whether the datum field includes a datum available for use by the thread. If the datum is unavailable, then execution of the thread is deferred until the datum is available.

In one embodiment, the thread is represented by a continuation descriptor and the step of deferring the thread includes storing the continuation descriptor within the datum field.

In yet another embodiment, the method of synchronizing threads includes awakening the deferred thread when the datum is available for the thread. Awakening includes removing the continuation descriptors stored in the datum field and then placing the datum in the field.
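
By way of illustration only, the deferral and awakening steps described above may be sketched in software. The following Python sketch is not part of the disclosed hardware; the names Location, read, write and DEFERRED are assumptions introduced for illustration, and a simple boolean stands in for the state indicator.

```python
from collections import deque
from dataclasses import dataclass

DEFERRED = object()  # sentinel returned when the reading thread must wait


@dataclass
class Location:
    """One frame location: a state flag plus a single datum field."""
    present: bool = False       # stands in for the state indicator
    datum_field: object = None  # datum when present, else deferred continuations


def read(loc, continuation):
    """Return the datum, or park the continuation inside the datum field."""
    if loc.present:
        return loc.datum_field
    if loc.datum_field is None:
        loc.datum_field = []
    loc.datum_field.append(continuation)  # thread waits where the data will land
    return DEFERRED


def write(loc, value, ready):
    """Remove deferred continuations, queue them, then store the datum."""
    if not loc.present and loc.datum_field:
        ready.extend(loc.datum_field)     # awakened threads become runnable
    loc.datum_field = value
    loc.present = True
```

In this sketch a read of an empty location stores the reader's continuation in the very field that will later hold the datum, so no separate wait list is needed; a later write first moves those continuations to a ready queue (here a deque) and only then stores the value.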

In another aspect of the invention, a system for synchronizing execution by a processing element of threads within a process is provided. The system includes a local frame cache and a state bit cache, means for executing by the processing element a thread within a process and means for fetching from the local frame cache a datum field and from the state bit cache an associated state indicator. The state indicator has a first state value and the system includes means for determining based on the first state value whether the datum field includes a datum available for use by the thread. Should the datum be unavailable, then means for deferring execution of the thread until the datum is available is provided.

In one embodiment, the system includes means for determining a second state for the state indicator wherein the second state will replace the first state during execution of the thread. The first state may indicate a current state and the second state may indicate a next state.

In another aspect of the invention, a method for synchronizing execution by a processing element of threads within a process is provided. Each thread includes a plurality of instructions and the method includes executing an instruction. During execution of the instruction, at least one source operand is fetched from a local frame cache and at least one corresponding state indicator having a first state is fetched from a state bit cache. Also fetched from the instruction is at least one state function associated with at least one fetched source operand. The state function is used to select from one of a plurality of tables N possible second states for the state indicator wherein each of the second states has an associated flag indicator. The first state is used to choose from the selected N possible states a second state for the state indicator and the second state replaces the first state during thread execution. The flag indicator specifies one of a plurality of actions for the thread to perform.
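
As a non-authoritative illustration, the two-step selection described above (the state function selects a table, the first state selects an entry within it) may be sketched as follows. The state names, the flag values and the table contents are hypothetical and are not taken from this description; only the label OPMC is borrowed from the One-Time Producer, Multiple Consumers function named later in the specification.

```python
from enum import Enum


class Flag(Enum):
    CONTINUE = 0   # operand usable, thread keeps running
    DEFER = 1      # operand missing, thread is deferred
    WAKE = 2       # datum written, deferred threads are awakened
    ERROR = 3      # access not permitted in this state


# Hypothetical states of a frame location (assumed for illustration).
EMPTY, WAITING, FULL = 0, 1, 2

# One table per (state function, access type). Each table is indexed by the
# first state and yields (second state, flag). Contents are illustrative.
TABLES = {
    ("OPMC", "read"): [
        (WAITING, Flag.DEFER),     # EMPTY: reader must wait
        (WAITING, Flag.DEFER),     # WAITING: another reader waits
        (FULL, Flag.CONTINUE),     # FULL: datum is available
    ],
    ("OPMC", "write"): [
        (FULL, Flag.CONTINUE),     # EMPTY: store the datum
        (FULL, Flag.WAKE),         # WAITING: store and wake readers
        (FULL, Flag.ERROR),        # FULL: second write is not allowed
    ],
}


def transition(state_function, access, first_state):
    """Select a table by state function, then index it by the first state."""
    table = TABLES[(state_function, access)]
    return table[first_state]
```

For example, transition("OPMC", "read", EMPTY) yields the second state WAITING together with the DEFER flag, which would tell the thread to suspend.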

In accordance with the principles of the present invention, a method and system for synchronizing threads within a process is provided. The synchronization mechanism of the present invention suspends execution of a thread when data for that thread is unavailable thereby allowing another thread to be executed. This provides for increased system utilization and system performance.

BRIEF DESCRIPTION OF DRAWINGS

The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features and advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1

depicts one example of a block diagram of a parallel processing system, in accordance with the principles of the present invention;
FIG. 2

is one example of the logical components associated with a main memory control unit of the parallel processing system of FIG. 1, in accordance with the principles of the present invention;
FIG. 3

is an illustration of one embodiment of a logical local frame residing in the main memory control unit of FIG. 2, in accordance with the principles of the present invention;
FIG. 3a

depicts one example of a logical work frame associated with the local frame of FIG. 3, in accordance with the principles of the present invention;
FIG. 3b

illustrates one embodiment of the fields contained within a non-compressed continuation descriptor of the present invention;
FIG. 4

is an illustration of one embodiment of the entries of a logical code frame residing in main memory control units of FIG. 2, in accordance with the principles of the present invention;
FIG. 4a

depicts one example of the fields associated with an instruction located in the code frame of FIG. 4, in accordance with the principles of the present invention;
FIG. 4b

depicts one example of the components of the destination and source specifiers of the instruction of FIG. 4a, in accordance with the principles of the present invention;
FIG. 5

illustrates one embodiment of a block diagram of the hardware components of a processing element of FIG. 1, in accordance with the principles of the present invention;
FIG. 6

depicts one example of the fields associated with a ready queue entry within a ready queue depicted in FIG. 5, in accordance with the principles of the present invention;
FIG. 7

depicts one example of a local continuation queue within the processing element of FIG. 5, in accordance with the principles of the present invention;
FIG. 8

depicts one example of the code frame cache of the processing element of FIG. 5;
FIG. 9

depicts one example of a code frame cache directory associated with the code frame cache of FIG. 8, in accordance with the principles of the present invention;

FIGS. 10a, 10b

depict one example of a block diagram of the components associated with a local frame cache located within the processing element depicted in FIG. 5, in accordance with the principles of the present invention;
FIG. 11

depicts one example of a local frame cache directory associated with the local frame cache of FIGS. 10a, 10b, in accordance with the principles of the present invention;
FIGS. 12a, 12b

depict one example of a flow diagram of the synchronization process of the present invention; and
FIG. 13

depicts one example of a flow diagram of the processes associated with writing data to a location in a cache, in accordance with the principles of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

The synchronization mechanism of the present invention resides within a parallel data processing system 10, as depicted in FIG. 1. In one embodiment, parallel data processing system 10 includes one or more main memory control units 12, a plurality of processing elements (PE) 14 and a plurality of input/output processors (I/OP) 16. Processing elements 14 communicate with each other, main memory control units 12 and input/output processors 16 through an interconnection network 18. One example of the main components associated with main memory control units 12 (or main storage) and processing elements 14 is explained in detail below.

Each of main memory control units 12 contains a portion of a sequentially addressed linear memory address space (not shown). The basic unit of information stored in the address space is a word (or memory location) having a unique address across all main memory control units. Contiguous words or memory locations may be combined into a logical structure such as a local frame 20 (FIG. 2), a code frame 22 and a work frame 23. In one embodiment, local frame 20 and work frame 23 generally refer to a group of data, and code frame 22 refers to a group of instructions. There may be a plurality of local frames, work frames, and code frames within main memory control units 12. In one embodiment, a particular local frame is associated with a process such that the address of a local frame is used as the identifier of a process.

Referring to FIG. 3, local frame 20 has, for example, 256 local frame locations 24. The first four locations are reserved for an invocation context map entry 26, which is associated with a process to be executed by one of processing elements 14, the next two slots are reserved for functions not discussed herein and the remainder of the locations (six through 255) are reserved for data local to the process. The information contained within invocation context map entry 26 is established prior to instantiation of the process and includes, for instance, the following fields:

(a) A three bit state indicator (ST) 28 which indicates the current state of a local frame location. As described further below, state indicator 28 is relative to a state function, which is used in accessing the frame location;

(b) A twelve bit physical processing element (PE) number 30 which identifies which processing element is to execute the process;

(c) A three bit process state (PS) 32 which indicates the state of the associated process. A process may have any one of a number of states including, for example: a free state, which is used as a reset state to indicate that the process is no longer active; an inactive state, used to prevent a process from executing; a suspended state, used to prevent any modification to a process so that, for example, the operating system may perform a recovery; an active state in main storage, used to indicate that the process can execute and that it is in main memory; or an active state not in main storage, used to indicate that the process can execute and it is within the processing element assigned to execute the process;

(d) A two bit local frame state (FS) 34 which indicates the state of local frame 20. Local frame 20 may have, as an example, one of the following states:
a present state representing that the local frame is present in main storage;
a transient state representing that the local frame is transient between main storage and memory local to the processing element, which is identified by processing element number 30;
an absent state indicating that references to local frame 20 are to be directed to the processing element's local memory, as identified by physical processing element number 30;

(e) A one bit invocation context queue control (ICQC) 36 which indicates the manner in which the process is enqueued onto an invocation context queue (described below) at instantiation;

(f) A one bit cache pinning control (CPC) which indicates whether a code frame or a local frame located within the processing element (e.g., within a code frame cache or a local frame cache, which is described below) is to be pinned;

(g) An eight bit local continuation queue head (LCQH) pointer 38 which contains an offset into a first entry of work frame 23 (FIG. 2) which is in contiguous memory locations to local frame 20 (described below);

(h) An eight bit local continuation queue tail (LCQT) 40 which contains an offset into the first empty entry at the end of work frame 23;

(i) A forty bit invocation context queue (ICQ) forward pointer 42 which is used in creating a doubly linked list of processes (referred to as an invocation context queue) in main storage for the processing element identified by processing element number 30. The invocation context queue has a head and a tail which are retained within the processing element, and it is ordered based on the enqueue discipline indicated by invocation context queue control 36;

(j) A forty bit invocation context queue (ICQ) backward pointer 44 which is also used in creating the invocation context queue of processes; and

(k) A forty bit code frame pointer 46 which specifies the address of a current code frame 22.
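
The field widths listed above can be checked and exercised with a short sketch. The following Python code is illustrative only: the field order within the entry, the short field names and the pack and unpack helpers are assumptions, since the description gives the widths but not the bit positions. Assuming 64-bit locations, the 158 bits of fields fit within the four locations reserved for the entry.

```python
# (name, width in bits) for each field of the invocation context map entry.
# The order is an assumption made for this sketch.
ICM_FIELDS = [
    ("ST", 3), ("PE", 12), ("PS", 3), ("FS", 2), ("ICQC", 1), ("CPC", 1),
    ("LCQH", 8), ("LCQT", 8),
    ("ICQ_FWD", 40), ("ICQ_BWD", 40), ("CODE_FRAME_PTR", 40),
]

TOTAL_BITS = sum(width for _, width in ICM_FIELDS)
RESERVED_BITS = 4 * 64  # four locations, assuming 64-bit words


def pack(values):
    """Pack a dict of field values into one integer, lowest field first."""
    word, shift = 0, 0
    for name, width in ICM_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word):
    """Inverse of pack: split the integer back into named fields."""
    values = {}
    for name, width in ICM_FIELDS:
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

A value that exceeds its field width, for instance a processing element number of 4096 in the twelve bit PE field, is rejected by pack rather than silently truncated.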

As previously stated, locations six through 255 of local frame 20 are reserved for data. Each data entry 48 includes state indicator 28 and a 64-bit datum field 50. Datum field 50 contains the data used during execution of the process.

Referring once again to FIG. 2, coupled to each local frame 20 is a logical structure referred to as work frame 23. Work frame 23 is allocated in the next 256 contiguous locations to local frame 20. Local frame 20 and work frame 23 are managed as a single entity. The first entries or locations 52 of work frame 23 (FIG. 3a) include one or more compressed continuation descriptors (CCD) 54 used, as explained below, in selecting an instruction to be executed or data which is to be used by the instruction. Each compressed continuation descriptor 54 includes, for instance, a code offset 56 and an index 58 (described below). In contrast, a continuation descriptor which is not compressed also includes a local frame pointer 60 (FIG. 3b) which indicates the beginning of a local frame. A compressed continuation descriptor does not need to store the local frame pointer, since it may be inferred from the local frame/work frame pair. In one embodiment, each location 52 in work frame 23 is capable of storing four compressed continuation descriptors.
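
As a sketch of how four compressed continuation descriptors might share one work frame location, the following Python code packs each descriptor into sixteen bits. The eight bit widths for the code offset and the index are assumptions (they are consistent with 256-entry frames and a 64-bit location, but are not stated above), as are the helper names.

```python
CCD_BITS = 16          # assumed: 8-bit code offset + 8-bit index
CCDS_PER_LOCATION = 4  # from the description: four per location


def pack_ccd(code_offset, index):
    """Pack one compressed continuation descriptor into 16 bits."""
    return ((code_offset & 0xFF) << 8) | (index & 0xFF)


def pack_location(ccds):
    """Pack up to four (code_offset, index) pairs into one 64-bit word."""
    if len(ccds) > CCDS_PER_LOCATION:
        raise ValueError("a location holds at most four descriptors")
    word = 0
    for slot, (code_offset, index) in enumerate(ccds):
        word |= pack_ccd(code_offset, index) << (CCD_BITS * slot)
    return word


def unpack_location(word):
    """Return the four (code_offset, index) pairs stored in a word."""
    pairs = []
    for slot in range(CCDS_PER_LOCATION):
        ccd = (word >> (CCD_BITS * slot)) & 0xFFFF
        pairs.append((ccd >> 8, ccd & 0xFF))
    return pairs


def expand(ccd, local_frame_pointer):
    """Rebuild a full descriptor by supplying the inferred frame pointer."""
    code_offset, index = ccd
    return (local_frame_pointer, code_offset, index)
```

The expand helper reflects the point made above: the local frame pointer is not stored in the compressed form because it can be supplied from the local frame/work frame pair that holds the descriptor.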

Referring again to FIG. 2, the local frame/work frame pair is coupled to code frame 22 through code frame pointer 46 of invocation context map entry 26 embedded within local frame 20. Code frame 22 includes, for instance, 256 code frame locations 62 (FIG. 4) and each location includes a word-sized instruction 64 or a constant (not shown) which is associated with the process to be executed by processing element 14 as indicated by processing element number 30. Subsequent to loading the instructions or constants (data for constants are stored at code frame generation) into the code frame, the code frame is immutable and thus, may be shared by other processes and processing elements. In one embodiment, code frame 22 is managed in main storage in sequentially addressed and contiguous groupings of sixteen word blocks. This allows for efficient transfer of code frames in main storage to memory locations local within the processing element (e.g., code frame caches, which are described below).

Referring to FIG. 4a, instruction 64 is, for instance, 64 bits in length and includes the following fields:

(a) An 8-bit instruction operation code 66 which specifies the operation to be executed by the processing element. The operation code controls the units of the processing element and instruction sequencing. In addition, it also controls network request generation;

(b) A two-bit thread control (TC) 68 which specifies the sequencing controls for the current thread and its successors (a process includes one or more threads of executable instructions) within the processing element in which the threads are being executed. The sequencing may be, for example, sequential instruction dispatch, preventive suspensive submode or end of thread, each of which is described herein.
Sequential instruction dispatch is the mode of execution which entails sequential dispatch of the instructions of a thread being executed by the processing element.
Preventive suspensive submode causes suspension of the current thread at initial dispatch into the processing element of an instruction within the thread. Should the instruction execute successfully, ...
End of thread indicates to the processing element that the current thread ends with the execution of this instruction. When termination of the thread is indicated, the processing element switches to the next available thread to be executed. This thread may be from the same process or a higher priority process, which is enqueued LIFO (Last in-First out) after initial dispatch of the current process into the processing element.

(c) A two-bit index increment control (X) 70 which controls the increment of the value of index 58 (FIGS. 3a, 3b) in the current continuation descriptor. When index increment control 70 is set to a nonzero value, index 58 is updated after execution of the current instruction and prior to execution of the succeeding instructions in the same thread.
In one example, index increment control 70 may indicate no change in the value of index 58 from this instruction to the next or it may indicate the incremental value is plus one or minus one;

(d) A sixteen bit destination specifier 72 is an address which indicates, for instance, the target of the instruction execution's result; and

(e) A sixteen bit source operand 0 specifier 74 and a sixteen bit source operand 1 specifier 76. Source operand specifiers 74, 76 are addresses which enable source operands to be obtained for the execution functions within the processing element.

Destination specifier 72 and source operand specifiers 74 and 76 each contain the following fields, as depicted in FIG. 4b:

(a) A four bit addressing mode field (AM) 78 used to encode the various sources (including, for example, signed literal, indexed signed literal, local frame cache (described below) or indexed local frame cache) for the instruction operand and destination specifiers. Addressing mode also encodes whether indexed operand addressing is to be used.

(b) A four bit state function field (SF) 80. In one embodiment, instructions accessing locations within a local frame cache (described further below) include for each source operand specifier and the destination specifier a state function used in indicating the synchronization function being used by that specifier. In accordance with the principles of the present invention, a number of synchronization functions may be supported and, therefore, there is a state function associated with each of the available synchronizing functions. Each state function allows, for example, two interpretations, one for a read access and one for a write access. Examples of the synchronizing functions which may be supported by the present invention include: One-Time Producer, Multiple Consumers (OPMC), which is similar to I-structures and has a write once property. It refers to the production of a data value which may be used by a number of instructions; and Multiple Producer, Single Consumer, which refers to the production of several data values used by one thread. In one embodiment, the resulting actions may be dependent on the state function applied, the current state of the local frame location, and the access type, read or write, as described in detail below.
The state function field is ignored when addressing mode 78 selects, for example, a literal operand.

(c) An eight bit frame offset 82 interpreted as an offset into one of the accessible local frames within a local frame cache (described below) or as a literal operand, as selected by the addressing mode of the source/destination specifier. As explained more fully below, when frame offset 82 is used as a frame offset it can be used directly or it can be initially added to the value of index 58 from the current continuation descriptor, modulo 256. In one embodiment, it is then appended to local frame pointer 60 indicated by addressing mode 78 and a local frame access within the local frame cache is attempted under control of state function 80, described in detail below.
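
The decoding of a sixteen bit specifier and the modulo 256 indexing described above may be sketched as follows. This Python sketch is illustrative only. The placement of the three fields within the sixteen bits (addressing mode in the top four bits, state function next, frame offset in the low eight bits) is an assumption, and because the description does not state which addressing mode encodings select indexing, the sketch takes that choice as a boolean argument.

```python
def decode_specifier(spec):
    """Split a 16-bit specifier into (addressing mode, state function, offset).

    The bit positions used here are assumed for illustration.
    """
    addressing_mode = (spec >> 12) & 0xF
    state_function = (spec >> 8) & 0xF
    frame_offset = spec & 0xFF
    return addressing_mode, state_function, frame_offset


def effective_offset(frame_offset, index, indexed):
    """Use the offset directly, or add the descriptor's index modulo 256."""
    if indexed:
        return (frame_offset + index) % 256
    return frame_offset


def frame_address(local_frame_pointer, offset):
    """Pair the frame pointer with the offset that is appended to it."""
    return (local_frame_pointer, offset)
```

With indexing enabled, a frame offset of 250 and an index of 10 wrap around to location 4 of the same frame, which keeps every indexed access inside the 256 locations of the local frame.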

Each code frame 22 within main memory control units 12 may exist in a state of absent or present. These states exist and are managed by software. An absent state indicates that the code frame is not in main storage and therefore, a process requiring the absent code frame is prevented from being instantiated. A present state indicates that the code frame is present in main storage and therefore, an inpage request from a processing element may be serviced. Once the code frame is in this state, it remains in this state until the frame is no longer required and it is returned to free storage under software control.

Referring once again to FIG. 1, main memory control units 12 are coupled to processing elements 14 through interconnection network 18. In accordance with the principles of the present invention, one example of the hardware components associated with each processing element 14 is depicted in FIG. 5 and includes the following: a ready queue 84, a local continuation queue 86, a code frame cache 88, a local frame cache 90, a state bit cache 91 and an execution unit 92. Each of these components is described below.

Ready queue 84 is, for example, a fully associative memory structured essentially as a queue that is capable of being enqueued at the head or the tail depending on the enqueueing discipline of the ready queue as specified by invocation context queue control 36. Ready queue 84 includes a number of ready queue entries 94 (FIG. 6) corresponding to processes or invocations to be executed by processing element 14. In one instance, ready queue 84 includes sixteen ready queue entries. As depicted in FIG. 6 and described herein, each ready queue entry 94 includes, for example, the following fields:

(a) A three bit ready queue (RQ) state 95 used to indicate the current state of a ready queue entry. Each ready queue entry may be in one of a number of states, including, for instance, the following: empty, indicating that the entry is unused and available; ready, indicating that the process is ready for execution by the processing element; pending pretest, indicating that the process is awaiting pretesting (described further below); prefetch active, indicating that the required resources for execution of the process are being fetched (this will also be described in detail below); sleeping, indicating that the ready queue entry is valid, but no threads within the process are available for dispatching to the processing element; and running, indicating a thread from the process represented by the ready queue entry is being executed by the processing element;

(b) A local frame pointer 96 is used, for instance, in accessing local frame cache 90, which as described in more detail below, includes the data required for process execution. In addition, local frame pointer 96 is used in determining whether a local frame exists within the local frame cache. Local frame pointer 96 is a copy of local frame pointer 60 and is loaded at the time that the ready queue entry is filled in;

(c) A local frame cache physical pointer 97 is an address into local frame cache 90;

(d) A three bit local frame cache state 98 is used to indicate the current state of a local frame in local frame cache 90. A local frame within the local frame cache may have a number of states including, for example: empty, indicating that the frame state for that local frame is unknown or not present; transient, indicating the local frame is currently being inpaged from main memory control units 12 to local frame cache 90 in processing element 14; and present, indicating the local frame is located in local frame cache 90;

(e) A code frame pointer 99 is used in accessing code frame cache 88. Code frame pointer 99 is a copy of code frame pointer 46 located in local frame 20;

(f) A code frame cache physical pointer 100 is used to address a block of instructions in code frame cache 88, as described below;

(g) A three bit code frame cache state 101 is used to determine the current state of a code frame within code frame cache 88. A code frame may have a number of states including, for example: empty, indicating that the frame state for a particular code frame is unknown or not present; transient, indicating the code frame is currently being inpaged from main memory control units 12 to code frame cache 88 in processing element 14; and present, indicating that the code frame is located in code frame cache 88;

(h) A local continuation queue head pointer 102 is located in each ready queue entry and is used, as described more fully below, in indicating the head of the list of threads for the particular process to be executed within a processing element. During processing, as described below, local continuation queue head pointer 102 receives its information from local continuation queue head pointer 38 located within invocation context map entry 26 of local frame 20 which is associated with the process to be executed; and

(i) A local continuation queue tail pointer 103 is located in each ready queue entry and is used, as described more fully below, in indicating the tail of the list of threads for the particular process. Similar to head pointer 102, local continuation queue tail pointer 103 is received from local frame 20. In particular, during enqueueing into the ready queue, local continuation queue tail pointer 40 in local frame 20 is copied into ready queue entry 94.
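
The head-or-tail enqueueing behavior of the ready queue may be sketched in software as follows. This Python sketch is an illustration under stated assumptions: the class and method names are hypothetical, a deque stands in for the associative memory, and because the description does not state which value of invocation context queue control 36 selects the head, the choice is passed as a boolean.

```python
from collections import deque


class ReadyQueue:
    """A bounded queue whose entries may be added at either end."""

    def __init__(self, capacity=16):   # sixteen entries in one instance
        self.capacity = capacity
        self.entries = deque()

    def enqueue(self, process_id, at_head):
        """Add a process at the head (served next) or at the tail."""
        if len(self.entries) >= self.capacity:
            raise OverflowError("ready queue is full")
        if at_head:
            self.entries.appendleft(process_id)
        else:
            self.entries.append(process_id)

    def dispatch(self):
        """Remove and return the process at the head of the queue."""
        return self.entries.popleft()
```

Enqueueing at the tail gives first in-first out service, while enqueueing at the head lets a later arrival be dispatched ahead of processes already waiting.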

Associated with each ready queue entry 94 is a local continuation queue 86 (FIG. 5). Each local continuation queue is, for example, a first in-first out queue wherein the top entry in the queue is the oldest. In general, local continuation queue 86 contains all of the pending threads or continuations associated with a process which is on the ready queue. The local continuation queue head and tail pointers located in ready queue entry 94 indicate the valid entries in the local continuation queue for the particular ready queue entry. Depicted in FIG. 7 is one example of local continuation queue 86.

Local continuation queue 86 includes a number of local continuation queue entries 104, in which each entry represents a pending thread for a particular process. Each local continuation queue entry 104 contains a compressed continuation descriptor including a code offset and an index 106. The code offset is used to locate an instruction within a code frame located in the code frame cache, and index 106 is used during indexed addressing to alter the value used to locate data within local frame cache 90.
Local continuation queue 86 is coupled to code frame cache 88 via code frame cache physical pointer 100, as described in detail herein. Referring to FIG. 8, code frame cache 88 includes, for example, 128 code frames 108 and each code frame includes, e.g., 256 instructions. In one embodiment, the code frames located in code frame cache 88 are inpaged from main memory control units 12 to code frame cache 88 during a prefetch stage, described below. Code frame cache 88 supports two simultaneous access ports, a read port used in fetching instructions and a write port used in writing code frames from main storage to the code frame cache.

In order to locate code frame 108, code frame pointer 99 located in ready queue entry 94 of ready queue 84 is input into a code frame cache directory 110 in order to obtain a value for code frame cache physical pointer 100. In one embodiment, code frame cache directory 110 is organized to allow an 8-way set-associative search.

Referring to FIG. 9, code frame cache directory 
110 includes, for example, sixteen rows and eight 
columns and each column and row intersection 
includes an entry 114. Each entry 114 includes a 
code frame address tag 116 and a state field 118. 
Code frame address tag 116 is, for example, the upper thirty-six bits of the 40-bit code frame pointer 99 and is used in determining the address value of code frame cache physical pointer 100. State field 118 is a three-bit field used in indicating the state of a particular code frame 108 within code frame cache 88. A code frame within the code frame cache may have one of the following states:

(a) An empty state which is defined by an unsuccessful attempt within a processing element to locate a particular code frame within the code frame cache. This state is proper when the code frame exists only in main storage or within another processing element. The empty state is recorded in the code frame cache at system initialization and whenever a code frame invalidation occurs.

(b) A transient state which applies to a code frame when it is in a state of motion. For example, the code frame is being moved from main storage to the code frame cache within the processing element (an inpage operation). During inpaging, one of two possible transient states may be recorded for the frame, depending on the desired final state of the code frame at inpage completion. The state is recorded as transient-final state, where final state may be the present state for a pretest/prefetch inpage (described below) or pinned for a pretest/prefetch inpage with invocation context map entry cache pinning control 37 as active. The transient state of a code frame in the code frame cache prevents selection of the code frame by a cache replacement algorithm, such as, for example, a least recently used (LRU) algorithm, thereby allowing eventual completion of the inpage operation.

(c) A present state which indicates that the contents of the desired code frame are entirely within code frame cache 88. When the code frame is in this state, then processing element 14 may fetch the instructions located in code frame cache 88.

(d) A pinned state which also indicates that the contents of the desired code frame are entirely within the code frame cache. However, if a code frame is marked as pinned, then replacement of the frame during pretest/prefetch is prevented (described below). In order to remove a pinned code frame from the cache, explicit software action is taken.

Address tag 116 is used in conjunction with code frame pointer 99 to determine an address value for code frame cache physical pointer 100. In particular, the four rightmost bits of code frame pointer 99 (FIG. 8) are used to index into one of the rows within code frame cache directory 110. Subsequent to obtaining a particular row, the contents of each code frame cache address tag 116 within the selected row is compared against the value of bits 12-47 of code pointer 46. If a match is found, then the address value of the code frame cache physical pointer is obtained. In particular, the address of pointer 100 is equal to the row identifier (i.e., the four rightmost bits of code frame pointer 99) and column identifier, which is the binary representation of the column (i.e., columns 0-7) in which the match was found.

Subsequent to determining code frame cache physical pointer 100, the physical pointer is used in conjunction with code offset 105 located in local continuation queue 86 to locate an instruction 120 within code frame 108. In order to select a particular instruction 120 within code frame 108, code frame cache physical pointer 100 is appended at 122 on the left of code offset 105 located in local continuation queue entry 104.

In one embodiment, instruction 120 includes the following fields which are loaded from the copy of code frame 22 located within main storage (the following fields are similar to the instruction fields described with reference to FIG. 4a, and therefore, some of the fields are not described in detail at this point): an operation code (OP CODE) 124, a thread [...], a destination specifier 126, a source operand 0 specifier 128 and a source operand 1 specifier 130. Destination specifier 126 indicates the address in which the result of the instruction execution is to be written and the source operand specifiers indicate the addresses of the data operands located in local frame cache 90 to be read and used during execution of the instruction.
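
The directory search and instruction addressing described above can be illustrated with a minimal sketch. It assumes the example geometry given in the text (sixteen rows, eight columns, 256 instructions per code frame). The names `DirEntry`, `lookup` and `instruction_address` are hypothetical, the tag is simplified to the upper thirty-six bits of the 40-bit pointer, and treating only present or pinned entries as a match is an assumption rather than something the description states.

```python
from dataclasses import dataclass

ROWS, WAYS = 16, 8           # example directory geometry from the text
ROW_BITS, WAY_BITS = 4, 3    # 16 rows -> 4 bits, 8 columns -> 3 bits
OFFSET_BITS = 8              # 256 instructions per code frame


@dataclass
class DirEntry:
    tag: int = 0             # upper 36 bits of the 40-bit code frame pointer
    state: str = "empty"     # empty, transient, present or pinned


def lookup(directory, code_frame_pointer):
    """Return the physical pointer (row id followed by column id) or None."""
    row = code_frame_pointer & (ROWS - 1)    # four rightmost bits select the row
    tag = code_frame_pointer >> ROW_BITS     # remaining upper bits form the tag
    for way, entry in enumerate(directory[row]):
        # Assumption: only frames fully in the cache count as a match.
        if entry.state in ("present", "pinned") and entry.tag == tag:
            return (row << WAY_BITS) | way
    return None


def instruction_address(physical_pointer, code_offset):
    """Append the physical pointer on the left of the code offset."""
    return (physical_pointer << OFFSET_BITS) | code_offset
```

With 128 code frames the physical pointer is seven bits wide (four row bits followed by three column bits), so an instruction address in this sketch is fifteen bits.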

Code frame cache 88 is coupled to local frame cache 90, as described in detail herein. Referring to FIGS. 10a, 10b, local frame cache 90 includes, for example, 256 local frames 131 and each frame includes 256 data words (e.g., invocation context queue information, destination location, source operands). In one embodiment, local frame cache 90 is organized into eight parallel word-wide banks. Each local frame 131 spans across all eight banks such that each bank stores thirty-two words of local frame 131. In one example, the first bank (bank 0) holds the following words of local frame 131: words 0, 8, 16, 24, 32, 40, ..., 248 (i.e., every eighth word of the local frame); the second bank (bank 1) holds words 1, 9, 17, 25, 33, 41, ..., 249, etc. It will be apparent to one of ordinary skill in the art that this is only one way in which the local frame cache may be organized and the invention is not limited to such a way. Local frame cache 90 supports two simultaneous access ports, a read port and a read/write port (not shown). The read port is used for fetching operands and the read/write port is used for storing results from instruction execution and for deferring continuations, as described below.
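
The interleaving of a local frame across the eight banks can be sketched as follows. This is only an illustration of the example organization above; the function names are hypothetical.

```python
BANKS = 8               # eight parallel word-wide banks
WORDS_PER_FRAME = 256   # data words in one local frame


def bank_and_slot(word):
    """Map a word number within a local frame to (bank, position in bank)."""
    return word % BANKS, word // BANKS


def words_in_bank(bank):
    """List the words of a local frame held by one bank (every eighth word)."""
    return list(range(bank, WORDS_PER_FRAME, BANKS))
```

Each bank ends up with 256 / 8 = 32 words of the frame, which matches the thirty-two words per bank stated above.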
In one embodiment, the local frames located in local frame cache 90 are inpaged from main memory control units 12 (i.e., datum 50 is inpaged) to local frame cache 90 during a prefetch stage, described below. In order to locate a local frame within the local frame cache (so that inpaged information may be written to a location within the local frame or so that information may be read from a particular location), local frame pointer 96 located in ready queue entry 94 is input into a local frame cache directory 133 in order to obtain an address value for local frame cache physical pointer 97 located in the ready queue entry. (In another embodiment, it is also possible to obtain the local frame cache physical pointer during pretesting (described below), thereby eliminating the process for obtaining the pointer address from the cache directory.) In one embodiment, local frame cache directory 133 is organized in a similar manner to code frame cache directory 110, i.e., it is organized to allow an 8-way set-associative search.

Referring to FIG. 11, local frame cache directory 133 includes, for example, thirty-two rows and eight columns and each column and row intersection includes an entry 134. Each entry 134 includes a local frame address tag 136 and a state field 138. Local frame address tag 136 is, for example, the upper thirty-five bits of the 40-bit local frame pointer 96 and is used in determining the address value of local frame cache physical pointer 97. State field 138 is a three-bit field used in indicating the state of a particular local frame 131 within local frame cache 90. A local frame within local frame cache 90 may have one of the following states:

(a) An empty state which is defined by an unsuccessful attempt within a processing element to locate a particular local frame within local frame cache 90. This state is valid for a local frame on the main storage free frame list and for one which resides entirely in main storage and is allocated to a process. The empty state may also be detected when a castout from the local frame cache to main storage is in progress for the referenced local frame, resulting in the actual inpage being delayed until castout completion. The empty state is recorded throughout local frame cache at system initialization and within local frame 131 in the cache whenever an attempted local frame inpage replacing a frame in cache is aborted.

(b) A transient state which applies to a local frame when it is in a state of motion, e.g., moving from main storage to the local frame cache (a local frame inpage). During the local frame inpage, the state of the inpaging frame is recorded within local frame cache state 98. During inpage, one of the transient states is recorded for the frame depending upon the desired final state of local frame 131 at inpage completion. The final state may be present for a pretest/prefetch inpage (explained further below) or pinned for a pretest/prefetch inpage with invocation context map entry pinning control 37 active. The transient state in the local frame cache prevents selection of local frame 131 by a local frame cache replacement algorithm (LRU), thereby allowing eventual completion of the inpage operation. This allows completion of any castout associated with the inpage.

(c) A free state which applies to a local frame in the local frame cache which is currently not allocated to any process. As one example, a local frame enters this state through process termination.

(d) A present state which indicates that the contents of local frame 131 are entirely within the local frame cache. When the local frame is within this state, the contents are available for access by an instruction within the processing element.

(e) A pinned state which also indicates that the contents of the desired local frame are entirely within the local frame cache. However, if a local frame is marked as pinned, then replacement of the frame by pretest/prefetch is prevented (described below). In order to remove a pinned local frame from the cache, software action is to be taken.
Address tag 136 is used in conjunction with local frame pointer 96 to determine the address value of local frame cache physical pointer 97. In particular, the five rightmost bits of local frame pointer 96 are used to index into one of the rows within local frame cache directory 133. Subsequent to obtaining a particular row, the contents of each local frame address tag 136 within the selected row is compared against the value of bits [...] of the logical local frame address (base address of the local frame in main storage). If a match is found, then the address value of local frame cache physical pointer 97 is obtained. In particular, the address of the pointer is equal to the row identifier (i.e., the five rightmost bits of local frame pointer 96) and column identifier, which is the binary representation of the column (i.e., columns 0-7) in which the match was found.

Subsequent to determining local frame cache physical pointer 97, the physical pointer is used along with source operand 0 specifier 128 and index 106 (i.e., the index is used if the addressing mode indicates that index addressing is to be used) to select a datum 132 from local frame cache 90 representative of a source operand 0 to be used during execution of an instruction. That is, a frame offset 140 of source operand 0 specifier 128 (as previously described, each specifier includes an addressing mode (AM), state function (SF) and frame offset) is added at 142 to index 106 and then, local frame cache physical pointer 97 is appended on the left of the summation to indicate a particular datum (e.g., source operand 0) within the local frame cache.

Similarly, local frame cache physical pointer 97 is used with source operand 1 specifier 130 and index 106 to select a datum 132 from local frame cache 90 representative of a source operand 1 also to be used during instruction execution. In particular, a frame offset 144 of source operand 1 specifier 130 is added at 146 to index 106 and then, local frame cache physical pointer 97 is appended on the left of the summation to indicate a particular datum (e.g., source operand 1) within the local frame cache.

In addition to the above, local frame cache physical pointer 97 is also used with destination specifier 126 and index 106 (again, if the index is to be used) to select a datum 132 from local frame cache 90 representative of the location within the local frame cache in which, e.g., the result of the instruction execution is to be stored. In particular, a frame offset 147 of destination specifier 126 is added at 149 to index 106 and then, local frame cache physical pointer 97 is appended on the left of the summation to indicate a particular datum (e.g., a result location) within the local frame cache.
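
The address formation used for both source operands and the result location can be sketched as below. The function name and parameters are hypothetical. The eight-bit offset width follows from the example of 256 words per local frame, and wrapping the sum within the frame is an assumption, since the description does not say what happens when the frame offset plus the index exceeds the frame size.

```python
FRAME_OFFSET_BITS = 8   # 256 data words per local frame in the example


def datum_address(physical_pointer, frame_offset, index=0, use_index=False):
    """Form a local frame cache address for one specifier.

    The index is added to the frame offset only when the addressing mode
    calls for indexed addressing; the physical pointer is then appended
    on the left of the sum.
    """
    offset = frame_offset + (index if use_index else 0)
    # Assumption: the sum is kept within the frame by masking to 8 bits.
    offset &= (1 << FRAME_OFFSET_BITS) - 1
    return (physical_pointer << FRAME_OFFSET_BITS) | offset
```

The same address would select the associated state indicator, since the text describes the state bit cache as being addressed in parallel with the local frame cache.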

Associated with each datum stored in local frame cache 90 is a 3-bit state indicator 148 located in state bit cache 91. Similar to local frame cache 90, state bit cache 91 includes, for example, 256 locations (152) and each location includes 256 3-bit state indicators 148. In one embodiment, state bit cache 91 is organized into eight word-wide banks accessible in parallel. Each location 152 spans across all eight banks such that each bank stores thirty-two words of location 152. (The organization of the state bit cache is similar to the organization of local frame cache 90, as described in detail above.) In accordance with the present invention, state indicators 148 are inpaged from main storage to state bit cache 91 (i.e., state field 28 of data entry 48 is copied) in parallel with the copying of datum 50 to local frame cache 90.

The state bits are loaded into or read from the state bit cache in a manner similar to that described above for the local frame cache. In particular, as shown in FIGS. 10a, 10b, each of the addresses obtained (e.g., by appending the local frame cache physical pointer on the left of the summation of the particular frame offset and the index, if needed) and used to select a datum 132 (either a source operand or a destination location) from local frame cache 90 is also used to select an associated state bit indicator 148 from state bit cache 91. Each state bit indicator 148 represents the current state of a particular datum. A particular datum and its associated state bit indicator are selected in parallel from local frame cache 90 and state bit cache 91, respectively, using the process described above. However, it will be apparent to one of ordinary skill in the art that the state bit cache may be organized in a number of ways and that only one embodiment is described herein. It will also be apparent to one of ordinary skill in the art that it is also possible to eliminate the state bit cache and place the state bit indicators within the local frame cache, e.g., adjacent to its associated datum.

State bit indicators may have a number of states (as one example, empty, waiting or present) and when an operand and an associated state indicator are selected from local frame cache 90 and state bit cache 91, respectively, the next state for each selected operand is determined. In addition, when a result is to be written to a location, the next state for that location is determined. In one embodiment, in order to determine the next state of an operand or a result location, a plurality of state transition tables and a state function associated with each specifier are used. In particular, there is a state function 154 for destination specifier 126, a state function 156 for source operand 0 specifier 128 and a state function 158 for source operand 1 specifier 130. Each of the state functions is used to indicate the synchronizing function, described above, associated with its specifier and each state function is used as an address into a state transition table. In one embodiment, there is a state transition table for each specifier. That is, there is a state transition table 160 associated with destination specifier 126; a state transition table 162 associated with source operand 0 specifier 128 and a state transition table 164 associated with source operand 1 specifier 130. Located within each of the state transition tables is an entry 165 which includes the possible next states 166 for each of the possible state functions. For example, if state function 154 represents a synchronizing function of One-Time Producer, Multiple Consumer, then located within state transition table 160 (state function 154 indexes into state transition table 160) is entry 165 including the possible next states for that synchronizing function. In one example, each entry may include eight possible next states. Further, if state function 154 could represent another synchronizing function, such as Multiple Producer, Single Consumer, then there would be another entry within state transition table 160 containing the possible next states for that synchronizing function. Similarly, state transition tables 162 and 164 include entries 165 which contain the possible next states 166. Each state transition table is, e.g., located within processing element 14 and may be statically altered at system initialization in any known manner to include additional entries of next states which support further synchronizing functions.

As shown in FIG. 10b, associated with each next state 166 is a 3-bit control flag 168. Control flag 168 is set at system initialization and is fixed for its associated next state. Control flag 168 is used in indicating to the processing element which action is to be taken for the thread which includes instruction 120. That is, control flag 168 indicates, for instance, whether execution of the thread is to be continued or whether execution is to be deferred (explained below).
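
One way to picture a state transition table is as a two-level lookup: the state function selects an entry, and the current state indicator selects one next state, together with its control flag, from that entry. The sketch below is illustrative only. The numeric state encodings, the state function code `0`, the table contents and the flag values are all hypothetical, since the description does not list them.

```python
# Hypothetical encodings for the 3-bit state indicator.
EMPTY, WAITING, PRESENT = 0, 1, 2

# Hypothetical control flag values.
CONTINUE, DEFER = "continue", "defer"

# Hypothetical table for a source operand specifier:
# state function -> entry; entry: current state -> (next state, control flag).
SOURCE_OPERAND_TABLE = {
    0: {
        EMPTY: (WAITING, DEFER),        # reader arrives before the data
        WAITING: (WAITING, DEFER),      # another reader joins the wait
        PRESENT: (PRESENT, CONTINUE),   # data already written, keep running
    },
}


def next_state(table, state_function, current_state):
    """Select the entry by state function, then the next state by current state."""
    entry = table[state_function]
    return entry[current_state]
```

Because the flag is stored alongside each next state, a single lookup yields both the new value for the state indicator and the action for the thread.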

Referring to FIG. 12, in operation, a process to be executed is dispatched by interconnection network 18 to one of processing elements 14, STEP 180 "Dispatch Process." Subsequent to receiving the dispatched process, a decision is made within the processing element as to whether the process is to be placed on the invocation context queue located within the main memory control unit which is associated with the particular processing element or on ready queue 84.

In particular, in deciding where to place the incoming process, an initial inquiry is made as to whether the process is to be enqueued on ready queue 84 in a first in-first out manner, INQUIRY 184 "[...] Enqueued FIFO?" Should the process be enqueued in a first in-first out manner, then a check is made to see if the ready queue is full and therefore, cannot accept any more processes, INQUIRY 186 "Ready Queue Full?" If the ready queue is full, the process is placed onto the tail end of the invocation context queue in main storage until a position is available in the ready queue, STEP 188 "Enqueue onto ICQ." When placing a process on the tail of the invocation context queue, invocation context queue backward pointer 44 located within invocation context map entry 26 of the process being added is replaced with the current value of the invocation context queue tail. In addition, invocation context queue forward pointer 42 of the last process identified by the old tail is updated to indicate the new tail of the invocation context queue, which is equal to the local frame pointer of the process being added. Further, the invocation context queue tail is set equal to the local frame pointer of the process being added, STEP 189 "Update ICQT and ICQH."

Returning to INQUIRY 186, if, however, the ready queue is not full, then the process is added to the tail end of ready queue 84, STEP 190 "Enqueue onto the Ready Queue." In addition to loading the process onto the ready queue, one or more threads associated with the process are enqueued onto local continuation queue 86, STEP 191 "Place Thread on LCQ." Subsequently, in order to indicate that there are valid entries in the local continuation queue for the process on the ready queue, local continuation queue head 38 and tail 40 are copied from invocation context queue 26 to local continuation queue head 102 and tail 103 located in ready queue entry 94 designated for that process.

When a process is placed on the ready queue, ready queue state 95 located within ready queue entry 94 is updated from empty to pending pretest, STEP 192 "RQ State is Updated."

Referring back to INQUIRY 184, should a process be enqueued onto ready queue 84 in a last in-first out fashion, then the process is enqueued onto the head of the ready queue with the possibility of replacing a valid ready queue entry 94 at the tail of the ready queue, STEP 190 "Enqueue onto Ready Queue." Once again, when the process is added to the ready queue, threads for that process are placed on local continuation queue 86, STEP 191 "Place Thread on LCQ," and ready queue state 95 is updated to pending pretest, STEP 192. Should a valid ready queue entry be replaced, the replaced process is placed onto the head of the invocation context queue in main storage. When a process is added to the head of the invocation context queue, invocation context queue forward pointer 42 for the process being added is updated to point to the old head of the invocation context queue. In addition, invocation context queue backward pointer 44 of the old head is updated to point to the process being added (i.e., its local frame pointer). Further, the invocation context queue head is updated to point to the new process represented by the local frame pointer. Also, local continuation queue head 102 and tail 103 are copied from ready queue entry 94 to local continuation queue head 38 and tail 40 in invocation context map entry 26.
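
The pointer updates described for adding a process to the tail or to the head of the invocation context queue amount to insertion into a doubly linked list whose links are local frame pointers. The following is a minimal sketch under that reading. The class and attribute names are hypothetical, and the handling of an initially empty queue is an assumption, as the description does not cover that case.

```python
class InvocationContextQueue:
    """Doubly linked queue keyed by the local frame pointer of each process."""

    def __init__(self):
        self.head = None     # invocation context queue head
        self.tail = None     # invocation context queue tail
        self.forward = {}    # stands in for forward pointer 42 of each entry
        self.backward = {}   # stands in for backward pointer 44 of each entry

    def enqueue_tail(self, lfp):
        old_tail = self.tail
        self.backward[lfp] = old_tail       # new entry points back at old tail
        self.forward[lfp] = None
        if old_tail is not None:
            self.forward[old_tail] = lfp    # old tail points forward to new entry
        else:
            self.head = lfp                 # assumption: first entry is also head
        self.tail = lfp

    def enqueue_head(self, lfp):
        old_head = self.head
        self.forward[lfp] = old_head        # new entry points forward at old head
        self.backward[lfp] = None
        if old_head is not None:
            self.backward[old_head] = lfp   # old head points back to new entry
        else:
            self.tail = lfp                 # assumption: first entry is also tail
        self.head = lfp
```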
As previously mentioned, when a process is added to the ready queue, the state of the ready queue entry is updated from empty to pending pretest. During pending pretest, the availability of the resources required for execution of the process is determined, INQUIRY 194 "Are Resources Available?" In particular, code frame cache 88 is checked to see whether code frame 108 as indicated by code frame pointer 99 in ready queue entry 94 is located within the code frame cache. Similarly, local frame cache 90 is checked to determine if local frame 131 as indicated by local frame pointer 96 in ready queue entry 94 is located within the local frame cache. Should it be determined that code frame 108 or local frame 131 is not present within its respective cache and, therefore, is not available to the process during processing, the missing code frame and/or local frame is inpaged from main storage and thereby made available, STEP 196 "Prefetch Resources." In particular, code frame 108 is copied from code frame 22 in main storage to code frame cache 88 (inpaging). Further, local frame 131 is copied from datum 50 located within local frame 20 in main memory control units 12 to local frame cache 90 and, in parallel, state indicator 28 which is associated with the datum is inpaged from main memory control units 12 (i.e., local frame 20) to state bit cache 91. The moving of data between main memory control units and one or more caches allows for the number of processes and threads which can be executed by the processing element to be bound only by the size of main storage and not by a finite amount of local storage. During inpaging, ready queue state 95 is updated from pending pretest to prefetch active, STEP 198 "Update RQ State."

Subsequent to inpaging the resources during prefetch or if an affirmative response is obtained from INQUIRY 194, ready queue state 95 is updated from prefetch active to ready, indicating that the process is ready for execution by the processing element, STEP 200 "Update RQ State." A ready process may be executed by the processing element when the process is, for instance, the top entry in ready queue 84. When the top entry is selected for execution, the top thread located in local continuation queue 86 is selected, STEP 202 "Select Process and Thread." When this occurs, ready queue state 95 is updated from ready to running, STEP 204 "Update RQ State." In addition, the state of the previous running ready queue entry is changed from running to empty, ready or sleeping (all of which are described above) depending on the conditions for which it relinquishes control of processing within the processing element.
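
The ready queue state updates walked through above can be summarized as a small transition map. This is a sketch, not a statement of the full state machine: the names are hypothetical, the direct step from pending pretest to ready (resources already available) is an assumption about how an affirmative INQUIRY 194 is handled, and the transitions out of sleeping are not covered in this passage and are left empty.

```python
from enum import Enum


class RQState(Enum):
    EMPTY = "empty"
    PENDING_PRETEST = "pending pretest"
    PREFETCH_ACTIVE = "prefetch active"
    READY = "ready"
    RUNNING = "running"
    SLEEPING = "sleeping"


TRANSITIONS = {
    RQState.EMPTY: {RQState.PENDING_PRETEST},
    # Assumption: pretest may go straight to ready when nothing must be inpaged.
    RQState.PENDING_PRETEST: {RQState.PREFETCH_ACTIVE, RQState.READY},
    RQState.PREFETCH_ACTIVE: {RQState.READY},
    RQState.READY: {RQState.RUNNING},
    RQState.RUNNING: {RQState.EMPTY, RQState.READY, RQState.SLEEPING},
    RQState.SLEEPING: set(),  # not described in this passage
}


def advance(current, new):
    """Return the new state, or raise if the step is not in the map."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```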
The selected thread (or local continuation queue entry 104) from local continuation queue 86 includes code offset 105 which is used, as described above, in selecting an instruction 120 from code frame cache 88 to be executed, STEP 206 "Fetch Instruction." When instruction 120 is fetched, local continuation queue head pointer 102 located in ready queue entry 94 is adjusted to indicate the removal of the processing thread from the local continuation queue, STEP 208 "Adjust LCQ Head."

As described above, the instruction which is selected includes source operand 0 specifier 128 which is used to select datum 132 representative of a source operand 0 from local frame cache 90 and its associated state bit 148 from state bit cache 91. Also, source operand 1 specifier 130 is used to select datum 132 representative of a source operand 1 from local frame cache 90 and its associated state bit 148 located in state bit cache 91, STEP 210 "Select Data and State Bits."



In addition to the above, state functions 156 and 158 located in source operand 0 specifier 128 and source operand 1 specifier 130, respectively, are used in selecting a number of possible next states 166 from state transition tables 162, 164. In particular, state function 156 is used as an address into state transition table 162 to select entry 165 which includes the next states for source operand 0 specifier. Similarly, state function 158 is used as an address into state transition table 164 to select entry 165 which includes the next states for source operand 1 specifier, STEP 212 "Select Possible Next States." (As described above, each state function is representative of a synchronizing function and the states associated with each synchronizing function are included in the state transition tables.)

Subsequent to selecting the possible next states for a source operand, the current state (state indicator 148) of the operand is used in choosing one state from the possible next states which represents the next state for that operand. For example, if there are eight next states and the value of state bit indicator 148 is zero, then the next state for state indicator 148 is the first of the possible next states (i.e., the first of the eight states), STEP 214 "Determine Next State." In one embodiment, for a particular synchronizing function, it may be that state bit indicator 148 represents a present state for an operand which has been read and the possible next states for a particular synchronizing function are empty, waiting and present. In one example, the next state to be selected for that operand is the present state. After the next state is determined, state indicator 148 is updated to the value of the next state, e.g., by writing the value of the next state into the current state value located in state bit cache 91, STEP 216 "Update Current State."

In addition to the above, a determination is made as to the course of action to be taken by the thread which includes the instruction being executed, STEP 218 "Determine Action to be Taken." Types of actions which may be taken include, for instance, continue with thread execution, suspend current thread and awaken deferred thread, each of which is explained below.

In one embodiment, in order to determine what action is to be taken by the thread, an inquiry is made into whether the data (e.g., source operand 0 and/or source operand 1) located in local frame cache 90 and selected by executing instruction 120 is available for use by the instruction, INQUIRY 220 "Is Data Available?" In particular, this determination is made by checking state indicator 148 (before it is updated to the next state) associated with each of source operands 0 and 1. Should state indicator 148 indicate, for instance, that an operand is in an empty state, then that operand is considered unavailable. If, however, state indicator 148 indicates that each operand is in, for example, a present state, then the operands are considered available. If the data is available, then execution of the thread continues and the result of the executing instruction is stored in a result location within local frame cache 90, STEP 222 "Continue Execution," as described in detail herein.

In one example, instructions are executed within execution unit 92, which is coupled to local frame cache 90 within processing element 14. Addressing mode 78 of each source operand specifier located in instruction 120 gates the appropriate data, such as source operand 0 and source operand 1 (which has been obtained as described above), into input registers (not shown) of execution unit 92. Execution unit 92 executes the instruction using the obtained operands and places the result in a destination (or result) location located within local frame cache 90, as indicated by destination specifier 126 of instruction 120. If, however, the result of the instruction execution is a branch to a specific location, then a new compressed continuation descriptor is enqueued onto local continuation queue 86 (if the new thread is for the same process that is currently being executed) or a new thread may be handled by interconnection network 18 and enqueued onto a different process' local continuation queue.

On the other hand, if the answer to INQUIRY
220 is in the negative and one or more of the
source operands are not available (e.g., the state
indicator associated with that operand indicates the
operand is not in a present state), then execution of
the thread associated with the executing instruction
is deferred, STEP 224 "Defer Execution of
Thread." (In one example, the particular instruction
continues executing, but the results are not stored.)
In particular, if source operand 0 or source operand
1 is in, for example, a state of empty or waiting and,
therefore, unavailable (if both operands are
unavailable, then in one embodiment, operand zero is
preferred over operand one), then the thread
currently executing (represented by code offset 105
and index 106 in local continuation queue entry
104 within local continuation queue 86) is
suspended until source operand 0 and source operand
1 (if both are needed) are available. When
suspension occurs, any effects of the instruction are
nullified.
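
The availability check described above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the names State and operand_to_wait_on are hypothetical and do not appear in the specification, and the three states shown (empty, waiting, present) are the examples named in the text rather than an exhaustive list.

```python
from enum import Enum


class State(Enum):
    """Example operand states named in the description (assumed set)."""
    EMPTY = 0
    WAITING = 1
    PRESENT = 2


def operand_to_wait_on(states):
    """Return the index of the operand the thread is deferred on.

    `states` lists the state indicators of source operand 0,
    source operand 1, and so on. If every operand is present the
    thread may continue and None is returned. When several operands
    are unavailable, the lowest-numbered one wins, mirroring the
    embodiment in which operand zero is preferred over operand one.
    """
    for index, state in enumerate(states):
        if state is not State.PRESENT:
            return index
    return None
```

A caller would continue execution when the function returns None and defer the thread on the returned operand otherwise.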

In order to suspend execution of a thread, code 
offset 105 and index 106 (also referred to as the 
compressed continuation descriptor) located within 
the local continuation queue are stored in the
datum location (or field) representative of the
unavailable source operand. Each datum 132 may receive
a number of compressed continuation descriptors
corresponding to a number of threads. In one
example, each datum may store four compressed
continuation descriptors.
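
As a rough sketch of the deferral just described, the following models a datum that can hold either a value or up to four compressed continuation descriptors. The class and function names (Datum, defer_thread, MAX_WAITERS) are hypothetical, the state is set directly rather than through the state transition tables, and the overflow behavior is an assumption, since the text does not say what happens when a fifth thread tries to wait on the same datum.

```python
from dataclasses import dataclass, field

MAX_WAITERS = 4  # "in one example, each datum may store four" descriptors


@dataclass
class Datum:
    """Hypothetical model of datum 132 and its state indicator."""
    state: str = "empty"          # "empty", "waiting" or "present"
    value: object = None
    waiters: list = field(default_factory=list)


def defer_thread(datum, code_offset, index):
    """Park a compressed continuation descriptor in an unavailable datum.

    The descriptor is modeled as the (code offset, index) pair that
    identifies the suspended thread.
    """
    if datum.state == "present":
        raise ValueError("operand is available; nothing to defer")
    if len(datum.waiters) >= MAX_WAITERS:
        # Assumption: the source does not describe this case.
        raise OverflowError("datum cannot hold another descriptor")
    datum.waiters.append((code_offset, index))
    datum.state = "waiting"
```

Storing the descriptor in the datum itself means that a later write to that datum finds the waiting threads without consulting any separate wait list.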

When data is to be written to a datum location
132 within local frame cache 90, the result location
and its associated state indicator are specified, as
described in detail above, by frame offset 147 of
destination specifier 126 located within code frame
cache 88, local frame cache physical pointer 97
and any corresponding index 106, STEP 226 "Data
is to be Written" (FIG. 13). In addition to selecting
the location and the state indicator, state function
154 located in destination specifier 126 is used as
an address into state transition table 160 to select
a number of possible next states 166 for the result
location (similar to the procedure described above
for selecting the next state for the source
operands). As described above, subsequent to
selecting the possible next states for the result
location, the current state indicator 148 for that location
is used to choose the next state for the location.
The current state indicator is then updated to
reflect the value of the next state. [...]
made as to whether [...],
INQUIRY 228 [...]
to be written. [...] read/write [...]
location is initially read to determine [...]
stored there before data [...].

Should chosen datum 132 be empty (as indicated
by state indicator 148 before it is updated to the
next state), then the data is written to that location,
STEP 230 "Write Data." On the other hand, if the
location is not empty, a determination is made as
to whether there is one or more compressed
continuation descriptors stored within the location and,
therefore, the location is in a waiting state (again,
as indicated by state indicator 148), STEP 231 "Is
Location in Waiting State?" If the location is not in
the waiting state, then the data is written, STEP
234 "Write Data." If, however, that location is in a
waiting state, then any and all compressed
continuation descriptors stored in that location are
removed and enqueued onto the local continuation
queue associated with the running process before
the data is written, STEP 232 "Awaken
Compressed Continuation Descriptors." Subsequent to
removing the compressed continuation descriptors,
the data is written to the indicated location, STEP
234 "Write Data."
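
The write path above can be sketched as follows. This is a minimal illustration, not the patented implementation: the datum is modeled as a plain dictionary, the function name write_datum is hypothetical, and the next state is simply assumed to be "present" after a write, whereas the specification selects it through the state transition tables.

```python
from collections import deque


def write_datum(datum, value, continuation_queue):
    """Write `value` into `datum`, awakening any deferred threads first.

    `datum` is a dict with keys "state", "value" and "waiters", where
    "waiters" holds compressed continuation descriptors. The queue
    stands in for the local continuation queue of the running process.
    """
    if datum["state"] == "waiting":
        # STEP 232: remove every descriptor and enqueue it for execution.
        while datum["waiters"]:
            continuation_queue.append(datum["waiters"].pop(0))
    # STEP 230 / STEP 234: the data is written to the location.
    datum["value"] = value
    datum["state"] = "present"  # assumed next state


queue = deque()
slot = {"state": "waiting", "value": None, "waiters": [(16, 1), (32, 2)]}
write_datum(slot, 99, queue)
```

After the call, both descriptors sit on the queue in the order they were stored, and the slot holds the value 99.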

In one specific embodiment, each next state
resident within state transition tables 160, 162, 164
has an associated 3-bit control flag 168 which is
retrieved when the possible next states are
retrieved. When one of the next states is selected, as
described above, the associated control flag 168 is
also selected and is used to indicate what action is
to be taken by the processing element. That is, the
control flag indicates, for example, whether thread
execution is to be continued, whether execution of
the thread is to be deferred or whether a deferred
thread is to be awakened. Each of these actions is
performed in the manner described above.
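
The two-stage lookup (state function selects the candidate next states, current state picks one of them together with its control flag) might be sketched as below. The table contents, the state function names "read" and "write", and the Action enumeration are all assumptions made for illustration; the specification does not give the actual table entries or the encoding of the 3-bit control flag.

```python
from enum import Enum


class Action(Enum):
    """Actions a control flag may indicate (encoding is assumed)."""
    CONTINUE = 0
    DEFER = 1
    AWAKEN = 2


# Hypothetical table: state function -> current state -> (next state, flag).
TRANSITIONS = {
    "read": {
        "empty": ("waiting", Action.DEFER),
        "waiting": ("waiting", Action.DEFER),
        "present": ("present", Action.CONTINUE),
    },
    "write": {
        "empty": ("present", Action.CONTINUE),
        "waiting": ("present", Action.AWAKEN),
        "present": ("present", Action.CONTINUE),
    },
}


def next_state(state_function, current_state):
    """Return (next state, action) for a datum.

    The state function selects the set of possible next states; the
    current state indicator then chooses one entry from that set.
    """
    candidates = TRANSITIONS[state_function]
    return candidates[current_state]
```

With this table, reading an empty datum yields the waiting state and a defer action, while writing to a waiting datum yields the present state and an awaken action.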

Although preferred embodiments have been 
depicted and described in detail herein, it will be 
apparent to those skilled in the relevant art that 
various modifications, additions, substitutions and 
the like can be made without departing from the 
spirit of the invention and these are therefore
considered to be within the scope of the invention as
defined in the following claims. 

Claims

1. A method for synchronizing execution by a
processing element of threads within a
process, said method comprising the steps of:
executing a thread within a process;
fetching during said thread execution, from a
local frame cache, a datum field;
fetching from a state bit cache a state
indicator, said state indicator being associated with
said datum field and having a first state value;
determining based on said first state value
whether said datum field includes a datum
available for use by said thread; and
deferring execution of said thread when said
datum is unavailable.

2. The method of claim 1, further including the
step of determining a second state for said
state indicator, said second state replacing
said first state during said thread execution.

3. The method of claim 2, wherein said second
state determining step includes the steps of:
selecting from said thread a state function to
be used in determining said second state;
using said state function to select from one of
a plurality of tables N possible second states
for said indicator; and
using said first state to choose from said
selected N possible states said second state for
said indicator.

4. The method of claim 3, wherein said first state
represents a current state of said datum and 
said second state represents a next state of 
said datum. 

5. The method of one of claims 1 to 4, wherein
said thread is represented by a continuation
descriptor and said deferring step includes the
step of storing said continuation descriptor
within said datum field.

6. The method of claim 5, wherein said
continuation descriptor is compressed before being
stored in said datum field and said datum field
can receive a plurality of said compressed
continuation descriptors.

7. The method of claim 6, further including the
step of awakening said deferred thread when
said datum is available for said thread.

8. The method of claim 7, wherein said
awakening step includes the step of removing said
compressed continuation descriptors from said
datum field.

9. The method of claim 8, wherein said removed
continuation descriptors are stored on a queue
local to said processing element.

10. The method of claim 8 or 9, wherein said
awakening step further includes the step of
storing said available datum in said datum field
when said compressed continuation
descriptors are removed.

11. A system for synchronizing execution by a
processing element of threads within a
process, said system comprising:
a local frame cache, said local frame cache
including a datum field;
a state bit cache, said state bit cache including
a state indicator corresponding to said datum
field;
means for executing by said processing
element a thread within a process;
means for fetching from said local frame cache
said datum field and from said state bit cache
said state indicator having a first state value;
means for determining based on said first state
value whether said datum field includes a
datum available for use by said thread; and
means for deferring execution of said thread
when said datum is unavailable.

12. The system of claim 11, further comprising
means for determining a second state for said 
state indicator, said second state replacing 
said first state during said thread execution. 

13. The system of claim 12, wherein said second
state determining means comprises:
means for selecting from said thread a state
function to be used in determining said second
state;
a plurality of tables each having N possible
second states for said indicator;
means for using said state function to select
from one of said plurality of tables said N
possible second states; and
means for using said first state to choose from
said selected N possible states said second
state for said indicator.

14. The system of claim 13, further comprising a
main storage, said main storage comprising a
copy of said datum field and a copy of said
state indicator, said copy of said datum field
being copied from main storage to said local
frame cache and said copy of said state
indicator being copied from said main storage to
said state bit cache.

15. A method for synchronizing execution by a 
processing element of threads within a pro- 
cess, each of said threads including a plurality 
of instructions, said method comprising the 
steps of:
executing an instruction of said thread;
fetching during said instruction execution from
a local frame cache at least one source
operand and from a state bit cache at least
one state indicator having a first state value,
said at least one state indicator corresponding
to said at least one source operand;
fetching from said instruction at least one state
function associated with said at least one
fetched source operand;
using said at least one state function to select
from one of a plurality of tables N possible
second states for said at least one state
indicator, each of said second states having a
corresponding flag indicator;
using said first state to choose from said
selected N possible states a second state for
said state indicator;
replacing said first state with said second state
during said thread execution; and
having said thread perform one of a plurality of
actions after said instruction execution, said
action being specified by said flag indicator
associated with said chosen second state.



16. The method of claim 15, wherein said plurality
of actions includes the following actions: con- 
tinuing execution of said thread, deferring ex- 
ecution of said thread and awakening a de- 
ferred thread. 

17. The method of claim 16, wherein said thread is 
represented by a continuation descriptor and 
said deferring execution action includes the 
step of storing said continuation descriptor 
within a source operand. 

18. The method of claim 17, wherein said
continuation descriptor is compressed before being
stored and said source operand can receive a
plurality of compressed continuation
descriptors.



19. The method of claim 18, wherein said
awakening action includes the step of removing a
compressed continuation descriptor from a
source operand.



20. The method of claim 19, wherein said
awakening action further includes the step of storing a
source operand when said compressed
continuation descriptor is removed.

21. The method of claim 20, wherein said removed
continuation descriptor is stored on a queue
local to said processing element.

22. A system for synchronizing execution by a
processing element of threads within a
process, each of said threads including a plurality
of instructions, said system comprising:
means for executing an instruction of said
thread;
a local frame cache, said local frame cache
including a plurality of source operands;
a state bit cache, said state bit cache including
a plurality of state indicators, each of said state
indicators having a first state value,
wherein one of said state indicators
corresponds to one of said source operands;
means for fetching during said instruction
execution from said local frame cache at least
one source operand and from said state bit
cache at least one corresponding state
indicator;
means for fetching from said instruction at
least one state function associated with said at
least one source operand;
a plurality of tables each having N possible
second states for said indicators;
means for using said at least one state function
to select from one of said plurality of tables
said N possible second states, each of said
second states having an associated flag
indicator;
means for using said first state to choose from
said selected N possible states a second state
for said state indicator;
means for replacing said first state with said
second state; and
means for having said thread perform one of a
plurality of actions, said action being specified
by said flag indicator associated with said
chosen second state.